1 Overview

In this hands-on exercise, you will learn how to handle geospatial data in R by using appropriate R packages.

1.1 Learning Outcome

By the end of this hands-on exercise, you should acquire the following competencies:

  • importing geospatial and aspatial data by using appropriate functions of the sf package,
  • assigning or transforming geospatial data from one coordinate system to another by using appropriate functions of the sf package,
  • performing geoprocessing (also known as GIS analysis) by using appropriate functions of the sf package,
  • optionally, performing the above geospatial data handling, transformation and geoprocessing tasks by using appropriate functions of the sp, rgdal and rgeos packages.

1.2 Data Acquisition

Before you can start using R, you are required to extract the necessary data sets from the appropriate source.

1.3 Getting Started

Before we get started, it is important for us to ensure that all the R packages we need have been installed.

  • Using the steps you learned earlier, check whether sf, sp, rgdal, rgeos and tidyverse have been installed, and install any that are missing. After the installation is complete, load the sf, sp, rgdal, rgeos and tidyverse packages.

The code chunk:

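# Packages required for this exercise
packages <- c('sp', 'rgdal', 'rgeos', 'sf', 'tidyverse')
for (p in packages) {
  # require() returns FALSE when the package is not installed
  if (!require(p, character.only = TRUE)) {
    install.packages(p)
  }
  library(p, character.only = TRUE)
}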

2 Working with sf package

In this hands-on exercise, you are required to import the following geospatial data into R:

  • MP14_SUBZONE_WEB_PL - a polygon feature layer in ESRI shapefile format,
  • CyclingPath - a line feature layer in ESRI shapefile format, and
  • PreSchool - a point feature layer in kml file format.

2.1 Importing Geospatial Data by using st_read()

In this section, you will learn how to import geospatial data in ESRI shapefile and Google’s KML formats into R as simple feature data.frames.

Before getting started, you are encouraged to read the sf vignette "2. Reading, Writing and Converting Simple Features".

2.1.1 Importing a polygon feature data in shapefile format

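The code chunk below uses the st_read() function of the sf package to import the MP14_SUBZONE_WEB_PL layer into R as a simple feature data.frame. For a shapefile, dsn is the folder that holds the files and layer is the file name without its extension.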

sf_mpsz = st_read(dsn = "data/geospatial", 
                  layer = "MP14_SUBZONE_WEB_PL")
## Reading layer `MP14_SUBZONE_WEB_PL' from data source `D:\tskam\GeoDSA\Hands-on_Ex\Hands-on_Ex02\data\geospatial' using driver `ESRI Shapefile'
## Simple feature collection with 323 features and 15 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 2667.538 ymin: 15748.72 xmax: 56396.44 ymax: 50256.33
## Projected CRS: SVY21

2.1.2 Importing a polyline feature data in shapefile format

The code chunk below uses the st_read() function of the sf package to import the CyclingPath layer into R as a simple feature data.frame.

sf_cyclingpath = st_read(dsn = "data/geospatial", 
                         layer = "CyclingPath")
## Reading layer `CyclingPath' from data source `D:\tskam\GeoDSA\Hands-on_Ex\Hands-on_Ex02\data\geospatial' using driver `ESRI Shapefile'
## Simple feature collection with 1625 features and 2 fields
## Geometry type: LINESTRING
## Dimension:     XY
## Bounding box:  xmin: 12711.19 ymin: 28711.33 xmax: 42626.09 ymax: 48948.15
## Projected CRS: SVY21

2.1.3 Importing a point feature data in kml format

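The code chunk below uses the st_read() function of the sf package to import the pre-schools-location-kml.kml file into R as a simple feature data.frame. Unlike the shapefile examples, the complete file path, including the .kml extension, is given.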

sf_preschool = st_read("data/geospatial/pre-schools-location-kml.kml")
## Reading layer `PRESCHOOLS_LOCATION' from data source `D:\tskam\GeoDSA\Hands-on_Ex\Hands-on_Ex02\data\geospatial\pre-schools-location-kml.kml' using driver `KML'
## Simple feature collection with 1359 features and 2 fields
## Geometry type: POINT
## Dimension:     XYZ
## Bounding box:  xmin: 103.6824 ymin: 1.248403 xmax: 103.9897 ymax: 1.462134
## z_range:       zmin: 0 zmax: 0
## Geodetic CRS:  WGS 84

Notice that the sf_preschool simple features data.frame is in the wgs84 geographic coordinate system. In Section 2.3, you will learn how to transform it into the svy21 projected coordinate system.
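The two calling styles of st_read() shown above (dsn plus layer, or a single file path) can be tried without the exercise data. The sketch below is only an illustration: the toy points, the layer name toy_points and the temporary GeoPackage file are assumptions made up for this example, not part of the exercise data sets.

```r
library(sf)

# Hypothetical toy data: two points in WGS 84 (not from the exercise data sets).
toy <- st_sf(
  name = c("a", "b"),
  geometry = st_sfc(st_point(c(103.80, 1.35)),
                    st_point(c(103.85, 1.30)),
                    crs = 4326)
)

# Write to a temporary GeoPackage so the example needs no external files.
tmp <- tempfile(fileext = ".gpkg")
st_write(toy, tmp, layer = "toy_points", quiet = TRUE)

# List the layers stored in the data source before reading it.
layer_names <- st_layers(tmp)$name
print(layer_names)

# Read the layer back as a simple feature data.frame.
toy_back <- st_read(tmp, layer = "toy_points", quiet = TRUE)
print(st_geometry_type(toy_back))
```

Checking the layer names first with st_layers() is useful for formats such as KML, where the layer name may differ from the file name.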

2.2 Checking the contents of a simple features data.frame object

Next, let us examine the structure of the newly created simple feature data.frame. There are at least two ways you can use to examine the structure of a simple feature data.frame.

First, we can view the structure of the simple feature data.frame in the Environment pane of RStudio. This is the handiest way. Alternatively, glimpse() can be used to display the structure of the newly created simple feature data.frame.

glimpse(sf_mpsz)
## Rows: 323
## Columns: 16
## $ OBJECTID   <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, ~
## $ SUBZONE_NO <int> 1, 1, 3, 8, 3, 7, 9, 2, 13, 7, 12, 6, 1, 5, 1, 1, 3, 2, 2, ~
## $ SUBZONE_N  <chr> "MARINA SOUTH", "PEARL'S HILL", "BOAT QUAY", "HENDERSON HIL~
## $ SUBZONE_C  <chr> "MSSZ01", "OTSZ01", "SRSZ03", "BMSZ08", "BMSZ03", "BMSZ07",~
## $ CA_IND     <chr> "Y", "Y", "Y", "N", "N", "N", "N", "Y", "N", "N", "N", "N",~
## $ PLN_AREA_N <chr> "MARINA SOUTH", "OUTRAM", "SINGAPORE RIVER", "BUKIT MERAH",~
## $ PLN_AREA_C <chr> "MS", "OT", "SR", "BM", "BM", "BM", "BM", "SR", "QT", "QT",~
## $ REGION_N   <chr> "CENTRAL REGION", "CENTRAL REGION", "CENTRAL REGION", "CENT~
## $ REGION_C   <chr> "CR", "CR", "CR", "CR", "CR", "CR", "CR", "CR", "CR", "CR",~
## $ INC_CRC    <chr> "5ED7EB253F99252E", "8C7149B9EB32EEFC", "C35FEFF02B13E0E5",~
## $ FMEL_UPD_D <date> 2014-12-05, 2014-12-05, 2014-12-05, 2014-12-05, 2014-12-05~
## $ X_ADDR     <dbl> 31595.84, 28679.06, 29654.96, 26782.83, 26201.96, 25358.82,~
## $ Y_ADDR     <dbl> 29220.19, 29782.05, 29974.66, 29933.77, 30005.70, 29991.38,~
## $ SHAPE_Leng <dbl> 5267.381, 3506.107, 1740.926, 3313.625, 2825.594, 4428.913,~
## $ SHAPE_Area <dbl> 1630379.27, 559816.25, 160807.50, 595428.89, 387429.44, 103~
## $ geometry   <MULTIPOLYGON [m]> MULTIPOLYGON (((31495.56 30..., MULTIPOLYGON (~

Notice that the last column of the simple feature data.frame is called geometry here. It is known as the simple feature list-column, an object of class sfc (refer to the Topic 2 slides for more discussion).
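The geometry list-column can be inspected on its own. The following is a minimal sketch on a made-up two-point object (toy is a hypothetical name, not one of the exercise layers); it uses st_geometry() to extract the sfc column and st_drop_geometry() to recover a plain data.frame.

```r
library(sf)

# Hypothetical toy object with one attribute and a point geometry column.
toy <- st_sf(id = 1:2,
             geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 1))))

# Extract the simple feature list-column and inspect its class.
geom <- st_geometry(toy)
print(class(geom))

# The name of the active geometry column is stored as an attribute.
geom_col <- attr(toy, "sf_column")
print(geom_col)

# Dropping the geometry leaves an ordinary data.frame of attributes.
attrs <- st_drop_geometry(toy)
print(class(attrs))
```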

You can also check the contents of sf_mpsz data.frame by using summary().

The code chunk:

summary(sf_mpsz)
##     OBJECTID       SUBZONE_NO      SUBZONE_N          SUBZONE_C        
##  Min.   :  1.0   Min.   : 1.000   Length:323         Length:323        
##  1st Qu.: 81.5   1st Qu.: 2.000   Class :character   Class :character  
##  Median :162.0   Median : 4.000   Mode  :character   Mode  :character  
##  Mean   :162.0   Mean   : 4.625                                        
##  3rd Qu.:242.5   3rd Qu.: 6.500                                        
##  Max.   :323.0   Max.   :17.000                                        
##     CA_IND           PLN_AREA_N         PLN_AREA_C          REGION_N        
##  Length:323         Length:323         Length:323         Length:323        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##    REGION_C           INC_CRC            FMEL_UPD_D             X_ADDR     
##  Length:323         Length:323         Min.   :2014-12-05   Min.   : 5093  
##  Class :character   Class :character   1st Qu.:2014-12-05   1st Qu.:21864  
##  Mode  :character   Mode  :character   Median :2014-12-05   Median :28465  
##                                        Mean   :2014-12-05   Mean   :27257  
##                                        3rd Qu.:2014-12-05   3rd Qu.:31674  
##                                        Max.   :2014-12-05   Max.   :50425  
##      Y_ADDR        SHAPE_Leng        SHAPE_Area                geometry  
##  Min.   :19579   Min.   :  871.5   Min.   :   39438   MULTIPOLYGON :323  
##  1st Qu.:31776   1st Qu.: 3709.6   1st Qu.:  628261   epsg:NA      :  0  
##  Median :35113   Median : 5211.9   Median : 1229894   +proj=tmer...:  0  
##  Mean   :36106   Mean   : 6524.4   Mean   : 2420882                      
##  3rd Qu.:39869   3rd Qu.: 6942.6   3rd Qu.: 2106483                      
##  Max.   :49553   Max.   :68083.9   Max.   :69748299

Lastly, head() can be used to list the first few records of the data.frame, as shown in the code chunk below.

head(sf_mpsz, n=4)  
## Simple feature collection with 4 features and 15 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 26403.48 ymin: 28369.47 xmax: 32362.39 ymax: 30396.46
## Projected CRS: SVY21
##   OBJECTID SUBZONE_NO      SUBZONE_N SUBZONE_C CA_IND      PLN_AREA_N
## 1        1          1   MARINA SOUTH    MSSZ01      Y    MARINA SOUTH
## 2        2          1   PEARL'S HILL    OTSZ01      Y          OUTRAM
## 3        3          3      BOAT QUAY    SRSZ03      Y SINGAPORE RIVER
## 4        4          8 HENDERSON HILL    BMSZ08      N     BUKIT MERAH
##   PLN_AREA_C       REGION_N REGION_C          INC_CRC FMEL_UPD_D   X_ADDR
## 1         MS CENTRAL REGION       CR 5ED7EB253F99252E 2014-12-05 31595.84
## 2         OT CENTRAL REGION       CR 8C7149B9EB32EEFC 2014-12-05 28679.06
## 3         SR CENTRAL REGION       CR C35FEFF02B13E0E5 2014-12-05 29654.96
## 4         BM CENTRAL REGION       CR 3775D82C5DDBEFBD 2014-12-05 26782.83
##     Y_ADDR SHAPE_Leng SHAPE_Area                       geometry
## 1 29220.19   5267.381  1630379.3 MULTIPOLYGON (((31495.56 30...
## 2 29782.05   3506.107   559816.2 MULTIPOLYGON (((29092.28 30...
## 3 29974.66   1740.926   160807.5 MULTIPOLYGON (((29932.33 29...
## 4 29933.77   3313.625   595428.9 MULTIPOLYGON (((27131.28 30...

2.3 Working with Projection

In this section, you will learn how to work with projection by using appropriate functions of sf package.

2.3.1 Assigning projection

In this section, you will learn how to assign EPSG code to sf_mpsz simple features data.frame.

First, check the coordinate reference system (CRS) of sf_mpsz by using st_crs(), as shown in the code chunk below.

st_crs(sf_mpsz)
## Coordinate Reference System:
##   User input: SVY21 
##   wkt:
## PROJCRS["SVY21",
##     BASEGEOGCRS["SVY21[WGS84]",
##         DATUM["World Geodetic System 1984",
##             ELLIPSOID["WGS 84",6378137,298.257223563,
##                 LENGTHUNIT["metre",1]],
##             ID["EPSG",6326]],
##         PRIMEM["Greenwich",0,
##             ANGLEUNIT["Degree",0.0174532925199433]]],
##     CONVERSION["unnamed",
##         METHOD["Transverse Mercator",
##             ID["EPSG",9807]],
##         PARAMETER["Latitude of natural origin",1.36666666666667,
##             ANGLEUNIT["Degree",0.0174532925199433],
##             ID["EPSG",8801]],
##         PARAMETER["Longitude of natural origin",103.833333333333,
##             ANGLEUNIT["Degree",0.0174532925199433],
##             ID["EPSG",8802]],
##         PARAMETER["Scale factor at natural origin",1,
##             SCALEUNIT["unity",1],
##             ID["EPSG",8805]],
##         PARAMETER["False easting",28001.642,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8806]],
##         PARAMETER["False northing",38744.572,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8807]]],
##     CS[Cartesian,2],
##         AXIS["(E)",east,
##             ORDER[1],
##             LENGTHUNIT["metre",1,
##                 ID["EPSG",9001]]],
##         AXIS["(N)",north,
##             ORDER[2],
##             LENGTHUNIT["metre",1,
##                 ID["EPSG",9001]]]]

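Although sf_mpsz is projected in svy21, the printout above carries no EPSG code for the CRS as a whole. Next, assign EPSG 3414 to the sf_mpsz simple features data.frame by using st_set_crs(). Note that st_set_crs() only replaces the CRS label; it does not reproject the coordinates.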

sf_mpsz3414 <- st_set_crs(sf_mpsz, 3414)

Let's check the CRS again.

st_crs(sf_mpsz3414)
## Coordinate Reference System:
##   User input: EPSG:3414 
##   wkt:
## PROJCRS["SVY21 / Singapore TM",
##     BASEGEOGCRS["SVY21",
##         DATUM["SVY21",
##             ELLIPSOID["WGS 84",6378137,298.257223563,
##                 LENGTHUNIT["metre",1]]],
##         PRIMEM["Greenwich",0,
##             ANGLEUNIT["degree",0.0174532925199433]],
##         ID["EPSG",4757]],
##     CONVERSION["Singapore Transverse Mercator",
##         METHOD["Transverse Mercator",
##             ID["EPSG",9807]],
##         PARAMETER["Latitude of natural origin",1.36666666666667,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8801]],
##         PARAMETER["Longitude of natural origin",103.833333333333,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8802]],
##         PARAMETER["Scale factor at natural origin",1,
##             SCALEUNIT["unity",1],
##             ID["EPSG",8805]],
##         PARAMETER["False easting",28001.642,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8806]],
##         PARAMETER["False northing",38744.572,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8807]]],
##     CS[Cartesian,2],
##         AXIS["northing (N)",north,
##             ORDER[1],
##             LENGTHUNIT["metre",1]],
##         AXIS["easting (E)",east,
##             ORDER[2],
##             LENGTHUNIT["metre",1]],
##     USAGE[
##         SCOPE["Cadastre, engineering survey, topographic mapping."],
##         AREA["Singapore - onshore and offshore."],
##         BBOX[1.13,103.59,1.47,104.07]],
##     ID["EPSG",3414]]

Notice that sf_mpsz3414 simple features data.frame is in EPSG: 3414 now.
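The difference between assigning a CRS and transforming to one can be seen on a single point. The sketch below assumes a hypothetical longitude/latitude pair (not taken from the exercise data): st_set_crs() leaves the coordinate values untouched, whereas st_transform() recomputes them.

```r
library(sf)

# Hypothetical point given in longitude/latitude, initially without any CRS.
pt <- st_sfc(st_point(c(103.8198, 1.3521)))

# Correct workflow: declare what the numbers are (WGS 84), then reproject.
pt_wgs84 <- st_set_crs(pt, 4326)
pt_svy21 <- st_transform(pt_wgs84, 3414)

# Incorrect shortcut: labelling degree values as SVY21 leaves them unchanged.
pt_mislabelled <- st_set_crs(pt, 3414)

print(st_coordinates(pt_wgs84))
print(st_coordinates(pt_svy21))
print(st_coordinates(pt_mislabelled))
```

Assigning a code is therefore appropriate only when the coordinates are already expressed in that system, as is the case for sf_mpsz.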

2.3.2 Transforming the projection of sf_preschool from wgs84 to svy21

In Section 2.1.3, we saw that the sf_preschool simple features data.frame is in the wgs84 geographic coordinate system.

st_crs(sf_preschool)
## Coordinate Reference System:
##   User input: WGS 84 
##   wkt:
## GEOGCRS["WGS 84",
##     DATUM["World Geodetic System 1984",
##         ELLIPSOID["WGS 84",6378137,298.257223563,
##             LENGTHUNIT["metre",1]]],
##     PRIMEM["Greenwich",0,
##         ANGLEUNIT["degree",0.0174532925199433]],
##     CS[ellipsoidal,2],
##         AXIS["geodetic latitude (Lat)",north,
##             ORDER[1],
##             ANGLEUNIT["degree",0.0174532925199433]],
##         AXIS["geodetic longitude (Lon)",east,
##             ORDER[2],
##             ANGLEUNIT["degree",0.0174532925199433]],
##     ID["EPSG",4326]]

Next, we will transform the sf_preschool simple features data.frame into the svy21 projected coordinate system (i.e. EPSG 3414) by using st_transform().

sf_preschool3414 <- st_transform(sf_preschool, 3414)
st_crs(sf_preschool3414)
## Coordinate Reference System:
##   User input: EPSG:3414 
##   wkt:
## PROJCRS["SVY21 / Singapore TM",
##     BASEGEOGCRS["SVY21",
##         DATUM["SVY21",
##             ELLIPSOID["WGS 84",6378137,298.257223563,
##                 LENGTHUNIT["metre",1]]],
##         PRIMEM["Greenwich",0,
##             ANGLEUNIT["degree",0.0174532925199433]],
##         ID["EPSG",4757]],
##     CONVERSION["Singapore Transverse Mercator",
##         METHOD["Transverse Mercator",
##             ID["EPSG",9807]],
##         PARAMETER["Latitude of natural origin",1.36666666666667,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8801]],
##         PARAMETER["Longitude of natural origin",103.833333333333,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8802]],
##         PARAMETER["Scale factor at natural origin",1,
##             SCALEUNIT["unity",1],
##             ID["EPSG",8805]],
##         PARAMETER["False easting",28001.642,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8806]],
##         PARAMETER["False northing",38744.572,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8807]]],
##     CS[Cartesian,2],
##         AXIS["northing (N)",north,
##             ORDER[1],
##             LENGTHUNIT["metre",1]],
##         AXIS["easting (E)",east,
##             ORDER[2],
##             LENGTHUNIT["metre",1]],
##     USAGE[
##         SCOPE["Cadastre, engineering survey, topographic mapping."],
##         AREA["Singapore - onshore and offshore."],
##         BBOX[1.13,103.59,1.47,104.07]],
##     ID["EPSG",3414]]

2.4 Importing and Converting Aspatial Data into simple features

In this section, you will learn how to import aspatial data (i.e. the Singapore Airbnb listings.csv file) into R as a tibble data.frame, and then convert the tibble data.frame into a simple features data.frame by using its x-coordinate and y-coordinate columns.

2.4.1 Importing the aspatial data

In the code chunk below, read_csv() of the readr package is used to parse listings.csv into R as a tibble data.frame.

listings <- read_csv("data/aspatial/listings.csv")
## 
## -- Column specification --------------------------------------------------------
## cols(
##   id = col_double(),
##   name = col_character(),
##   host_id = col_double(),
##   host_name = col_character(),
##   neighbourhood_group = col_character(),
##   neighbourhood = col_character(),
##   latitude = col_double(),
##   longitude = col_double(),
##   room_type = col_character(),
##   price = col_double(),
##   minimum_nights = col_double(),
##   number_of_reviews = col_double(),
##   last_review = col_date(format = ""),
##   reviews_per_month = col_double(),
##   calculated_host_listings_count = col_double(),
##   availability_365 = col_double()
## )

  • After importing the data file into R, it is important for us to review the data object.

2.4.2 Creating a sf data.frame

The code chunk below converts the listings data frame into a simple feature data frame by using st_as_sf() of the sf package.

Things to learn from the arguments:

  • The coords argument requires you to provide the column name of the x-coordinates first, followed by the column name of the y-coordinates. Here, longitude is the x-coordinate and latitude is the y-coordinate.
  • The crs argument requires you to provide the coordinate system in EPSG format. EPSG: 4326 is the wgs84 geographic coordinate system; it is used here because the latitude and longitude columns are in decimal degrees. EPSG: 3414 is the Singapore SVY21 projected coordinate system. You can look up the EPSG codes of other countries at epsg.io.

listing_sf <- st_as_sf(listings, 
                       coords = c("longitude", "latitude"),
                       crs= 4326)
glimpse(listing_sf)
## Rows: 4,388
## Columns: 15
## $ id                             <dbl> 49091, 50646, 56334, 71609, 71896, 7190~
## $ name                           <chr> "COZICOMFORT LONG TERM STAY ROOM 2", "P~
## $ host_id                        <dbl> 266763, 227796, 266763, 367042, 367042,~
## $ host_name                      <chr> "Francesca", "Sujatha", "Francesca", "B~
## $ neighbourhood_group            <chr> "North Region", "Central Region", "Nort~
## $ neighbourhood                  <chr> "Woodlands", "Bukit Timah", "Woodlands"~
## $ room_type                      <chr> "Private room", "Private room", "Privat~
## $ price                          <dbl> 81, 80, 67, 177, 81, 81, 206, 52, 40, 7~
## $ minimum_nights                 <dbl> 180, 90, 6, 90, 90, 90, 1, 14, 14, 90, ~
## $ number_of_reviews              <dbl> 1, 18, 20, 20, 24, 48, 29, 20, 13, 133,~
## $ last_review                    <date> 2013-10-21, 2014-12-26, 2015-10-01, 20~
## $ reviews_per_month              <dbl> 0.01, 0.21, 0.17, 0.18, 0.20, 0.40, 0.2~
## $ calculated_host_listings_count <dbl> 2, 1, 2, 5, 5, 5, 5, 47, 47, 7, 1, 47, ~
## $ availability_365               <dbl> 365, 365, 365, 365, 1, 365, 181, 350, 0~
## $ geometry                       <POINT [°]> POINT (103.7958 1.44255), POINT (~
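The conversion can be tried on a small table. The sketch below uses a hypothetical two-row data.frame (toy_listings) whose column names mirror listings.csv; the values are made up for illustration.

```r
library(sf)

# Hypothetical listings table; the column names mirror listings.csv.
toy_listings <- data.frame(
  id = c(1, 2),
  latitude = c(1.44255, 1.33235),
  longitude = c(103.7958, 103.7852)
)

toy_sf <- st_as_sf(toy_listings,
                   coords = c("longitude", "latitude"),  # x first, then y
                   crs = 4326)

# By default the coordinate columns are consumed into the geometry column.
print(names(toy_sf))
print(st_coordinates(toy_sf))
```

This is why the glimpse() output above shows 15 columns rather than 16: latitude and longitude have been replaced by a single geometry column.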

2.4.3 Transforming Projection

Next, we will transform the listing simple feature data.frame from the wgs84 geographic coordinate system to the svy21 projected coordinate system by using st_transform().

listing_sf <- st_transform(listing_sf, 3414)
glimpse(listing_sf)
## Rows: 4,388
## Columns: 15
## $ id                             <dbl> 49091, 50646, 56334, 71609, 71896, 7190~
## $ name                           <chr> "COZICOMFORT LONG TERM STAY ROOM 2", "P~
## $ host_id                        <dbl> 266763, 227796, 266763, 367042, 367042,~
## $ host_name                      <chr> "Francesca", "Sujatha", "Francesca", "B~
## $ neighbourhood_group            <chr> "North Region", "Central Region", "Nort~
## $ neighbourhood                  <chr> "Woodlands", "Bukit Timah", "Woodlands"~
## $ room_type                      <chr> "Private room", "Private room", "Privat~
## $ price                          <dbl> 81, 80, 67, 177, 81, 81, 206, 52, 40, 7~
## $ minimum_nights                 <dbl> 180, 90, 6, 90, 90, 90, 1, 14, 14, 90, ~
## $ number_of_reviews              <dbl> 1, 18, 20, 20, 24, 48, 29, 20, 13, 133,~
## $ last_review                    <date> 2013-10-21, 2014-12-26, 2015-10-01, 20~
## $ reviews_per_month              <dbl> 0.01, 0.21, 0.17, 0.18, 0.20, 0.40, 0.2~
## $ calculated_host_listings_count <dbl> 2, 1, 2, 5, 5, 5, 5, 47, 47, 7, 1, 47, ~
## $ availability_365               <dbl> 365, 365, 365, 365, 1, 365, 181, 350, 0~
## $ geometry                       <POINT [m]> POINT (23824.77 47135.4), POINT (~

2.5 Plotting the spatial data

To view the spatial data, plot() function of sf package can be used.

plot(sf_mpsz)
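Calling plot() on a whole sf object draws one panel per attribute, up to a default maximum. When only the outlines or a single attribute are wanted, the object can be subset first. The sketch below uses made-up toy polygons (the names square and toy are hypothetical) and draws to a null device so that it runs without writing a file.

```r
library(sf)

# Hypothetical helper that builds a unit square starting at x0.
square <- function(x0) {
  st_polygon(list(rbind(c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1),
                        c(x0, 1), c(x0, 0))))
}

# Toy polygons with two attributes (not the exercise data).
toy <- st_sf(value = c(10, 20),
             label = c("a", "b"),
             geometry = st_sfc(square(0), square(2)))

pdf(NULL)                    # null device: nothing is written to disk
plot(st_geometry(toy))       # outlines only
plot(toy["value"])           # a single attribute
invisible(dev.off())

outline <- st_geometry(toy)
```

In an interactive session, the pdf(NULL) and dev.off() lines can be omitted so that the maps appear in the Plots pane.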

2.6 Conversion to sp class

Although the simple feature data.frame is gaining popularity over sp’s Spatial* classes, many geospatial analysis packages still require their input geospatial data to be in sp’s Spatial* classes. In this section, you will learn how to convert a simple feature data.frame to sp’s Spatial* class.

2.6.1 Converting a point features data.frame to SpatialPointsDataFrame

The code chunk below uses as_Spatial() of sf package to convert sf_preschool3414 simple feature data.frame to sp’s Spatial* class.

sp_preschool <- as_Spatial(sf_preschool3414)
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO", prefer_proj
## = prefer_proj): Discarded datum Unknown based on WGS84 ellipsoid in Proj4
## definition
## Warning in showSRID(SRS_string, format = "PROJ", multiline = "NO", prefer_proj =
## prefer_proj): Discarded datum SVY21 in Proj4 definition

Notice that the output is a SpatialPointsDataFrame class.

You can check the content of the SpatialPointsDataFrame by using summary() as shown in the code chunk below.

summary(sp_preschool)
## Object of class SpatialPointsDataFrame
## Coordinates:
##                min      max
## coords.x1 11203.01 45404.24
## coords.x2 25667.60 49300.88
## coords.x3     0.00     0.00
## Is projected: TRUE 
## proj4string :
## [+proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1
## +x_0=28001.642 +y_0=38744.572 +ellps=WGS84 +units=m +no_defs]
## Number of points: 1359
## Data attributes:
##      Name           Description       
##  Length:1359        Length:1359       
##  Class :character   Class :character  
##  Mode  :character   Mode  :character

DIY: Using the steps you have learned, convert the sf_mpsz3414 and sf_mpsz simple feature data.frames to sp’s Spatial* classes. After the conversion, examine the output spatial classes carefully. Write short notes to describe your observations of the output spatial classes.

2.7 Geoprocessing with sf package

Besides providing functions to handle and wrangle geospatial data, the sf package also provides functions to perform geoprocessing tasks like those found in most GIS toolkits.

In this section, you will learn how to perform two popular GIS geoprocessing tasks, namely buffering and point-in-polygon count, by using the sf package.

2.7.1 Buffering

The scenario:

The authority is planning to upgrade the existing cycling paths. To do so, it needs to acquire 5 metres of reserve land on both sides of the current cycling paths. You are tasked to determine the extent of the land that needs to be acquired and its total area.

The solution:

Create 5-metre buffers around the cycling paths by using st_buffer(), then calculate the total area of the buffers by using st_area().

sf_buffer_cycling <- st_buffer(sf_cyclingpath, 
                               dist=5, nQuadSegs = 30)
sf_buffer_cycling$AREA <- st_area(sf_buffer_cycling)
sum(sf_buffer_cycling$AREA)
## 773143.9 [m^2]
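The buffer area can be sanity-checked on a case with a known answer. The sketch below assumes a hypothetical straight 100 m path (not part of the CyclingPath layer): a 5 m buffer around it is a 100 m by 10 m rectangle plus two semicircular end caps, so the computed area should be close to 2 * 5 * 100 + pi * 5^2.

```r
library(sf)

# Hypothetical straight 100 m path in SVY21 (EPSG:3414), so units are metres.
path <- st_sfc(st_linestring(rbind(c(0, 0), c(100, 0))), crs = 3414)

# Same arguments as the cycling path buffer above.
buf <- st_buffer(path, dist = 5, nQuadSegs = 30)
buf_area <- as.numeric(st_area(buf))

# Analytical area: a 100 m x 10 m rectangle plus two semicircular end caps.
expected <- 2 * 5 * 100 + pi * 5^2

print(buf_area)
print(expected)
```

The computed value falls slightly below the analytical one because nQuadSegs controls how many straight segments approximate each quarter circle; a larger value gives a closer approximation.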

Because sf_buffer_cycling is also a data.frame, you can plot the distribution of the buffer areas easily by using geom_histogram() of ggplot2.

ggplot(data = sf_buffer_cycling,
       aes(x=as.numeric(AREA))) +
         geom_histogram(bins=30, 
                        color="black",
                        fill="light blue")

2.7.2 Point-in-polygon count

The scenario:

A pre-school services group wants to find out the number of pre-schools in each Planning Subzone.

The solution:

The code chunk below first identifies the pre-schools located inside each Planning Subzone by using st_intersects(). Then, lengths() is used to count the number of pre-schools that fall inside each planning subzone.

sf_mpsz3414$`PreSch Count`<- lengths(st_intersects(sf_mpsz3414, sf_preschool3414))

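Warning: Do not confuse st_intersects() with st_intersection(). The former reports which features intersect, whereas the latter returns the intersecting geometries.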
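The contrast between st_intersects() and st_intersection() is easier to see on a toy case. The sketch below assumes two made-up square zones and three made-up points in plain planar coordinates (the names square, zones and pts are hypothetical).

```r
library(sf)

# Hypothetical helper that builds a square with lower-left corner (x0, y0).
square <- function(x0, y0, size = 1) {
  st_polygon(list(rbind(c(x0, y0), c(x0 + size, y0), c(x0 + size, y0 + size),
                        c(x0, y0 + size), c(x0, y0))))
}

# Two toy zones and three toy points (planar coordinates, no CRS).
zones <- st_sf(zone = c("A", "B"),
               geometry = st_sfc(square(0, 0), square(2, 0)))
pts <- st_sf(pt = 1:3,
             geometry = st_sfc(st_point(c(0.2, 0.5)),
                               st_point(c(0.8, 0.5)),
                               st_point(c(2.5, 0.5))))

# st_intersects() returns, for each zone, the indices of intersecting points.
hits <- st_intersects(zones, pts)
zones$count <- lengths(hits)
print(zones$count)

# st_intersection() instead returns the intersecting geometries themselves.
overlap <- st_intersection(st_geometry(pts), st_geometry(zones))
print(length(overlap))
```

Note that lengths() (plural) counts the elements of each list entry, which is what turns the index list into a per-zone count.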

You can check the summary statistics of the newly derived PreSch Count field by using summary() as shown in the code chunk below.

summary(sf_mpsz3414$`PreSch Count`)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   0.000   2.000   4.207   6.000  37.000

To list the planning subzone with the largest number of pre-schools, top_n() of the dplyr package is used as shown in the code chunk below.

top_n(sf_mpsz3414, 1, `PreSch Count`)
## Simple feature collection with 1 feature and 16 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 23449.05 ymin: 46001.23 xmax: 25594.22 ymax: 47996.47
## Projected CRS: SVY21 / Singapore TM
##   OBJECTID SUBZONE_NO      SUBZONE_N SUBZONE_C CA_IND PLN_AREA_N PLN_AREA_C
## 1      290          3 WOODLANDS EAST    WDSZ03      N  WOODLANDS         WD
##       REGION_N REGION_C          INC_CRC FMEL_UPD_D   X_ADDR   Y_ADDR
## 1 NORTH REGION       NR C90769E43EE6B0F2 2014-12-05 24506.64 46991.63
##   SHAPE_Leng SHAPE_Area                       geometry PreSch Count
## 1   6603.608    2553464 MULTIPOLYGON (((24786.75 46...           37

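Quiz: Calculate the density of pre-schools by planning subzone. With the help of an appropriate graphical method, describe the distribution of the newly derived variable.

The code chunk below uses st_area() of the sf package to derive the area of each planning subzone, then mutate() of dplyr to compute the number of pre-schools per square kilometre (the areas are in square metres, hence the factor of 1000000). The two ggplot() calls then plot a histogram of the density and a scatterplot of count against density.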

sf_mpsz3414$Area <- sf_mpsz3414 %>%
  st_area()
sf_mpsz3414 <- sf_mpsz3414 %>%
  mutate(`PreSch Density` = `PreSch Count`/Area * 1000000)
ggplot(data=sf_mpsz3414, 
       aes(x= as.numeric(`PreSch Density`)))+
  geom_histogram(bins=20, 
                 color="black", fill="light blue")

ggplot(data=sf_mpsz3414, 
       aes(y = `PreSch Count`, x= as.numeric(`PreSch Density`)))+
  geom_point(color="black", fill="light blue")
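The unit handling in the density calculation can be checked on a case with a known answer. The sketch below assumes a hypothetical 2 km by 1 km zone and a made-up count of 6 pre-schools, so the density should be 3 per square kilometre. It converts the area to a plain number first, which is one way to avoid carrying the m^2 unit label into the result.

```r
library(sf)

# Hypothetical 2 km x 1 km zone in SVY21 with a made-up count of 6 pre-schools.
zone <- st_sfc(st_polygon(list(rbind(c(0, 0), c(2000, 0), c(2000, 1000),
                                     c(0, 1000), c(0, 0)))), crs = 3414)
count <- 6

area_m2 <- st_area(zone)                # units object in m^2
area_km2 <- as.numeric(area_m2) / 1e6   # plain number of square kilometres
density_per_km2 <- count / area_km2
print(density_per_km2)
```

In the code chunk above, the density column is derived from a units object, which is why as.numeric() is applied before plotting.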

3 Working with sp, rgdal and rgeos Packages (Optional)

In this section, you will learn how to handle geospatial data in shapefile format by using the sp, rgdal and rgeos packages in R.

3.1 Importing a shapefile

In this section, you will learn how to import the MP14_SUBZONE_WEB_PL GIS layer into R. It is stored in shapefile format, and its spatial data model is polygon.

To import the GIS data layer into R, readOGR() from rgdal package will be used.

The data importing task is performed by using the code chunk below:

mpsz_sp <- readOGR(dsn = "data/geospatial", 
                   layer = "MP14_SUBZONE_WEB_PL") 
## OGR data source with driver: ESRI Shapefile 
## Source: "D:\tskam\GeoDSA\Hands-on_Ex\Hands-on_Ex02\data\geospatial", layer: "MP14_SUBZONE_WEB_PL"
## with 323 features
## It has 15 fields

Notice that mpsz_sp is a SpatialPolygonsDataFrame object.

3.1.1 Checking the contents of a SpatialPolygonsDataFrame

You can check the contents of mpsz_sp data object by using summary().

The code chunk:

summary(mpsz_sp)
## Object of class SpatialPolygonsDataFrame
## Coordinates:
##         min      max
## x  2667.538 56396.44
## y 15748.721 50256.33
## Is projected: TRUE 
## proj4string :
## [+proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1
## +x_0=28001.642 +y_0=38744.572 +datum=WGS84 +units=m +no_defs]
## Data attributes:
##     OBJECTID       SUBZONE_NO      SUBZONE_N          SUBZONE_C        
##  Min.   :  1.0   Min.   : 1.000   Length:323         Length:323        
##  1st Qu.: 81.5   1st Qu.: 2.000   Class :character   Class :character  
##  Median :162.0   Median : 4.000   Mode  :character   Mode  :character  
##  Mean   :162.0   Mean   : 4.625                                        
##  3rd Qu.:242.5   3rd Qu.: 6.500                                        
##  Max.   :323.0   Max.   :17.000                                        
##     CA_IND           PLN_AREA_N         PLN_AREA_C          REGION_N        
##  Length:323         Length:323         Length:323         Length:323        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##    REGION_C           INC_CRC           FMEL_UPD_D            X_ADDR     
##  Length:323         Length:323         Length:323         Min.   : 5093  
##  Class :character   Class :character   Class :character   1st Qu.:21864  
##  Mode  :character   Mode  :character   Mode  :character   Median :28465  
##                                                           Mean   :27257  
##                                                           3rd Qu.:31674  
##                                                           Max.   :50425  
##      Y_ADDR        SHAPE_Leng        SHAPE_Area      
##  Min.   :19579   Min.   :  871.5   Min.   :   39438  
##  1st Qu.:31776   1st Qu.: 3709.6   1st Qu.:  628261  
##  Median :35113   Median : 5211.9   Median : 1229894  
##  Mean   :36106   Mean   : 6524.4   Mean   : 2420882  
##  3rd Qu.:39869   3rd Qu.: 6942.6   3rd Qu.: 2106483  
##  Max.   :49553   Max.   :68083.9   Max.   :69748299

Let’s view the first few records of mpsz_sp.

The code chunk:

head(mpsz_sp, n=4)  

3.1.2 Plotting the spatial data

To view the spatial data, plot() of Base R can be used.

The code chunk:

plot(mpsz_sp)

3.2 Now It’s Your Turn

Using the functions you have learned, import the Pre-School and Cycling Path GIS data files into R as spatial objects.

The solution:

The pre-schools GIS data is in kml format. Before we can import the data file into R, we will use ogrListLayers function of rgdal package to check the actual data structure of the kml data file.

ogrListLayers("data/geospatial/pre-schools-location-kml.kml")
## [1] "PRESCHOOLS_LOCATION"
## attr(,"driver")
## [1] "KML"
## attr(,"nlayers")
## [1] 1

3.2.1 Importing kml GIS data

Notice that the file pre-schools-location-kml.kml is only the data source; the layer it contains is called PRESCHOOLS_LOCATION (refer to the list above). To import the layer, we need to specify PRESCHOOLS_LOCATION as the layer name.

The code chunk below will do the trick.

preschool <- readOGR("data/geospatial/pre-schools-location-kml.kml",
                     "PRESCHOOLS_LOCATION")
## OGR data source with driver: KML 
## Source: "D:\tskam\GeoDSA\Hands-on_Ex\Hands-on_Ex02\data\geospatial\pre-schools-location-kml.kml", layer: "PRESCHOOLS_LOCATION"
## with 1359 features
## It has 2 fields

3.2.2 Importing GIS data layer from LTA DataMall

In this section, you will learn how to import line geospatial data into R. The geospatial data is the CyclingPath shapefile from LTA DataMall (https://www.mytransport.sg/content/mytransport/home/dataMall.html).

cyclingpath <- readOGR (dsn = "data/geospatial", 
                        layer = "CyclingPath")
## OGR data source with driver: ESRI Shapefile 
## Source: "D:\tskam\GeoDSA\Hands-on_Ex\Hands-on_Ex02\data\geospatial", layer: "CyclingPath"
## with 1625 features
## It has 2 fields

  • Show the code you used to check the contents of the preschool and cyclingpath spatial objects.
  • Describe their spatial data models, boundary coordinates, projection, and attribute variables.

3.2.3 Assigning a coordinate system

  • Use CRS() of sp together with spTransform(), whose methods for sp objects are provided by rgdal.

mpsz_svy21 <- spTransform(mpsz_sp, 
                          CRS("+init=epsg:3414"))

3.3 Reprojecting a geospatial data

Now, it is your turn to change the projection system of the preschool data set from wgs84 to svy21.

The solution:

preschool_svy21 <- spTransform(preschool, 
                          CRS("+init=epsg:3414"))

3.4 Geoprocessing with rgeos

The scenario:

The authority is planning to upgrade the existing cycling paths. To do so, it needs to acquire 5 metres of reserve land on both sides of the current cycling paths. You are tasked to determine the extent of the land that needs to be acquired and its total area.

The solution (a single buffer, with all cycling paths dissolved into one geometry):

buf_cyclingpath <- gBuffer(cyclingpath, width = 5)

The solution (one buffer per cycling path, by setting byid = TRUE):

buf_cyclingpath <- gBuffer(cyclingpath, byid = TRUE,  
                           width = 5)

The solution (the area of each buffer and the total area):

buf_cyclingpath@data$Area <- gArea(buf_cyclingpath, 
                                   byid = TRUE)
sum(buf_cyclingpath@data$Area)
## [1] 771024.9